Differential expression in RNA-seq: a matter of depth.

نویسندگان

  • Sonia Tarazona
  • Fernando García-Alcalde
  • Joaquín Dopazo
  • Alberto Ferrer
  • Ana Conesa
چکیده

Next-generation sequencing (NGS) technologies are revolutionizing genome research, and in particular, their application to transcriptomics (RNA-seq) is increasingly being used for gene expression profiling as a replacement for microarrays. However, the properties of RNA-seq data have not been yet fully established, and additional research is needed for understanding how these data respond to differential expression analysis. In this work, we set out to gain insights into the characteristics of RNA-seq data analysis by studying an important parameter of this technology: the sequencing depth. We have analyzed how sequencing depth affects the detection of transcripts and their identification as differentially expressed, looking at aspects such as transcript biotype, length, expression level, and fold-change. We have evaluated different algorithms available for the analysis of RNA-seq and proposed a novel approach--NOISeq--that differs from existing methods in that it is data-adaptive and nonparametric. Our results reveal that most existing methodologies suffer from a strong dependency on sequencing depth for their differential expression calls and that this results in a considerable number of false positives that increases as the number of reads grows. In contrast, our proposed method models the noise distribution from the actual data, can therefore better adapt to the size of the data set, and is more effective in controlling the rate of false discoveries. This work discusses the true potential of RNA-seq for studying regulation at low expression ranges, the noise within RNA-seq data, and the issue of replication.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Regulatory effects of cis- and trans-LncRNAs on differential expression of genes following infection with viral hemorrhagic septicemia virus in rainbow trout (Oncorhynchus mykiss)

In this study the cis and trans regulatory effect of long non-coding genes (lncRNA) on the expression of genes in fish infected by Viral hemorrhagic septicemia virus (VHS) was investigated using RNA-seq technology. At the end of experimental period (the thirty fifth day), total RNA was extracted from spleen tissue (group treated with virus) and physiological serum (control group) was used to pr...

متن کامل

Investigating the Function of Predicted Proteins from RNA-Seq Data in Holstein and Cholistani Cattle Breeds

This study was performed to determine the digital expression profile of different genes expressed in Holstein and Cholistani breeds as well as to evaluate the performance of predicted proteins derived from differentially expressed genes between these two breeds using RNA-Seq data. For this purpose, the whole mRNA sequence for a blood sample of American Holstein and Pakistani Cholistani cattle p...

متن کامل

Power analysis and sample size estimation for RNA-Seq differential expression.

It is crucial for researchers to optimize RNA-seq experimental designs for differential expression detection. Currently, the field lacks general methods to estimate power and sample size for RNA-Seq in complex experimental designs, under the assumption of the negative binomial distribution. We simulate RNA-Seq count data based on parameters estimated from six widely different public data sets (...

متن کامل

The Utility of Shallow RNA-Seq for Documenting Differential Gene Expression in Genes with High and Low Levels of Expression

The sequencing depth necessary for documenting differential gene expression using RNA-Seq has been little explored outside of model systems. In particular, the depth required to analyze large-scale patterns of differential transcription factor expression is not known. The goal of the present study is to explore the effectiveness of shallow (relatively low read depth) RNA-Seq. We focus on two ti...

متن کامل

BioC2014: RNA-Seq workflow for differential gene expression

This lab will walk you through an end-to-end RNA-Seq differential expression workflow. We will start from the FASTQ files, align to the reference genome, prepare gene expression values as a count table by counting the sequenced fragments, perform differential gene expression analysis, and visually explore the results. This lab covers the basic introduction. For a more in-depth explanation of th...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Genome research

دوره 21 12  شماره 

صفحات  -

تاریخ انتشار 2011